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Abstract —This paper concerns state estimation problems in a 
mean field control setting. In a finite population model, the goal 
is to estimate the joint distribution of the population state and 
the state of a typical individual. The observation equations are 
a noisy measurement of the population. 

The general results are applied to demand dispatch for 
regulation of the power grid, based on randomized local control 
algorithms. In prior work by the authors it is shown that local 
control can be designed so that the aggregate of loads behaves 
as a controllable resource, with accuracy matching or exceeding 
traditional sources of frequency regulation. The operational cost 
is nearly zero in many cases. 

The information exchange between grid and load is minimal, 
but it is assumed in the overall control architecture that the 
aggregate power consumption of loads is available to the grid 
operator. It is shown that the Kalman filter can be constructed 
to reduce these communication requirements, and to provide the 
grid operator with accurate estimates of the mean and variance 
of quality of service (QoS) for an individual load. 

I. Introduction 

Mean field models are a valuable tool for design and 
performance approximation for certain classes of interacting 
systems [l]-[3]. The infinite-population mean-field equations 
provide tremendous insight, but ultimately we must translate 
this insight to address a finite-population reality. In this paper 
we propose algorithms based on the Kalman filter to obtain 
estimates of first and second order statistics of the population 
and a typical individual, based on noisy observations of the 
population. 

While the potential applications are far broader than power 
systems, for ease of exposition it is convenient to restrict 
attention to one application. 

Renewable energy sources such as wind and solar power 
have a high degree of unpredictability and time variation, 
which complicates balancing supply and demand. One possible 
way to address this challenge is to harness the inherent 
flexibility in demand of many types of loads. 

Demand Response is traditionally meant as a reduction in 
load in response to some grid-level event. It is in use today 
for peak-shaving (smoothing demand), and for contingency 
reserves (load-shedding following generation loss). 

It is argued in [4]-[6] that the value of demand-side flexi¬ 
bility is far greater than this. Loads can supply a range of grid 

Research supported by NSF grants CPS-0931416 and CPS-1259040, and 
the French National Research Agency grant ANR-12-MONU-0019 

Y.C. and S.M. are with the Department of Electrical and Computer Engg. 
at the University of Florida, Gainesville. A.B. is with Inria and the Computer 
Science Dept, of Ecole Normale Superieure, Paris, France. 


services, such as the balancing reserves required at BPA, or 
the Reg-D/A regulation reserves used at PJM [5]. These grid 
services can be obtained without impacting quality of service 
(QoS) for consumers [3], [7]. This is only possible through 
design. The term Demand Dispatch, introduced in [8], is used 
to emphasize the difference between the goals of our own work 
and traditional demand response. 

The application in this paper concerns a large collection of 
loads whose power consumption is not continuously variable. 
Examples include thermostatically controlled loads (TCLs), as 
considered in [2], [9], and irrigation or pool-pumps [3], [6]. In 
these papers and [10], [11] it is argued that randomization at 
the load is valuable to avoid synchronization, and to simplify 
control at the grid level. 

In much of this prior work, a mean-field model is obtained 
for control design at the grid level - this is a deterministic 
model of the aggregate of loads, obtained as a law of large 
numbers limit as the population of loads tends to infinity. The 
control solution adopted in [2] is based on state-feedback for 
a linear state space model with partial observations. Although 
the mean-field model is bi-linear, it is represented as a linear 
model by treating the product of inputs and states as a new 
input; this is why state estimation is needed for implementation 
of the algorithm. In [6] a randomized policy is designed for 
each load so that the mean-field model is an input-output 
system that is easily controlled without the use of state 
estimates. 

For simplicity, in this paper attention is restricted to the 
setting of [6], in which each load evolves as a controlled 
Markov chain. The transition probability is determined by its 
own state, and a scalar signal Q broadcast from a balancing 
authority (BA). The extension to vector inputs, as in [2], 
requires only changes in notation. 

The common dynamics are defined by a controlled transition 
matrix {P^ : ( G K}. For the fth load, there is a state process 
JV* whose transition probability is defined by, 

Pdx-,x+) = P{A,Vi = x+ I XI = X-, Ct = Cl (1) 

where x~ and x^ are possible state-values. In the case of a 
water heater, the state x G X might represent temperature of 
the water, and whether the unit is operating or not. 

If there are N loads operating independently, conditional on 
the common signal then the empirical distribution (i.e., the 
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histogram of state values) is defined as the average, 

1 ^ 

=—'^I{Xl = x}, xGX 

' i=l 

Viewed as a row vector, the following recursion is central to 
the analysis in [3], [6]: 

= /rf Pc* + . (2) 

An observation model is also linear in the state. 

Ft = ^ {x)U (x) + Vt (3) 

X 

where W: X —?► K. In applications to demand dispatch, U{x) 
represents power consumption of a load when its state is x, 
so that '^he average power consumption at 

time t. 

It is established in [3], [6] that W := {Wt : f > 1} is a 
d-dimensional martingale-difference sequence (and hence un¬ 
correlated). The i.i.d. sampling model assumed in the present 
paper implies that V = {Vt '■ t > l}isa martingale-difference 
sequence that is also uncorrelated with W. 

Prop. 2.1 contains full details of this system description. 
The Kalman filter is developed in two settings, each with 
the same observation process: The first is constructed to obtain 
estimates of The second filter obtains estimates of the joint 
statistics of a larger state that includes both and the state 
of a typical individual. The main conclusions are summarized 
here: 

(i) A measurement architecture is proposed in which each 
load broadcasts its state only occasionally - say, once per 
day. The observation equations in the aggregate model 
then include white noise, whose conditional variance is 
computed. The state equations for the population/individual 
dynamics evolve as a linear stochastic system with white 
noise disturbance, whose conditional covariance matrix is 
also computable. 

(ii) In the examples considered, the observability Grammian 
is not full rank, and an approximate time-invariant model is 
also unobservable. Moreover, Prop. 4.1 demonstrates that 
a symmetry property (that holds in the example of [6]), 
implies that the model cannot be observable. 

However, in numerical experiments it is found that the 
Kalman filter remains valuable for reducing the impact 
of measurement noise, and for estimating the distribution 
of quality of service. In particular, estimates of certain 
first and second order statistics of an individual load are 
remarkably accurate, even though the measurements are a 
noisy sequence of samples from the population. 

(iii) In the face of un-modeled dynamics such as load het¬ 
erogeneity, or additional “opt-out” control used to enforce 
QoS bounds, the Kalman filter combined with PI control 
continues to perform nearly perfectly, even with 0.1% 
sampling of loads. 

There are in fact two general formulations of the Kalman 
filter. In the first, most typical setting, the sequence of Kalman 
gains is deterministic; obtained through a Riccati equation (a 
recursive equation driven by the covariance matrices for the 


state and observation noise). This is known to be L 2 -optimal 
over all estimators that are linear functions of the observations. 

A second formulation of the Kalman filter uses conditional 
covariance matrices to define the Kalman gain — see (22,23) 
and surrounding discussion. If the state/observation noise 
is conditionally Gaussian, then this Kalman filter coincides 
with the nonlinear filter, which is L 2 -optimal over all causal 
estimators [12]. Because the Riccati equation is a nonlinear 
function of the covariance matrices, this version of the Kalman 
filter may be a nonlinear function of the observations. 

The second is attractive because it is easy to compute 
formulae for the conditional covariance matrices, while the 
unconditional covariance matrices only admit approximations. 
Moreover, when considering the dynamics of the aggregate, a 
Gaussian approximation of the noise is justified by the Central 
Limit Theorem (CLT). 

Related research: In addition to the references cited 
above, there are many papers on demand dispatch based on 
centralized control, or relying on real-time prices to solve the 
control problem of interest. Much of the latter is closer to 
demand response, and has little intersection with the research 
summarized here. 

An application of this formulation of the Kalman filter 
was considered previously in [13] for a single Markov chain 
without control, with measurements subject to Gaussian error. 

There are several recent papers with similar goals in the lit¬ 
erature on mean-field models. Most closely related is [14], [15] 
which concerns partially observed LQG mean field games, 
with several classes of players. The state estimation problem is 
from the point of view of the individual - each “minor agent” 
obtains noisy and partial observations, and wishes to estimate 
the “major state” as well as the aggregate. The solution is 
obtained through the construction of a Kalman filter. This prior 
work is also motivated by application to power systems. 

The remainder of the paper consists of four sections 
organized as follows. The following section describes the 
stochastic model on which the estimation algorithms are based. 
Filtering equations are derived in Section III for a collection of 
controlled Markov models with partial observations. The algo¬ 
rithms have been tested in various different settings - results 
for demand dispatch using residential pools and also TCLs 
are summarized in Section IV. Conclusions and directions for 
future research are contained in Section V. 

Further results may be found in the dissertation [16]. 

II. Mean Field Model 

It is assumed throughout the paper that a family of Markov 
transition matrices : C G R} is given that is continuous 
in the parameter Q The finite state space is denoted X = 
{x^,..., x^}, so that each is a d x d matrix. 

The mean-field model is defined as the approximation of 
(2) obtained as N —i' oo. This is the deterministic recursion, 

d-t+i = (T) 

with po given, and where ^ is obtained via causal feedback. 
This paper is concerned with the stochastic system (2), but 
the steps used to justify the limit will lead to the second-order 
statistics required to describe the Kalman filter. 
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A. Aggregate dynamics 

The individual dynamics are described by the controlled 
Markov model (1). We lift the state space from the d-element 
set X = to the d-dimensional simplex S. For 

the load at time t, the element FJ € S is the degenerate 
distribution whose mass is concentrated at x if XI = x\ that 
is, T\ = 5x- These distributions evolve according to a random 
linear system, 

(5) 

in which FJ is interpreted as a d-dimensional row vector, GJ is 
a d X d matrix with entries 0 or 1 only, and G\{x^,x^) = 1 
for all j. 

The following assumptions are imposed throughout. 

Al; The input is defined by causal output feedback: For a 
continuous family of functions (pt ■ —>■ K, 

Ct = MYo,...,Yt), t>0. 

A2: For some function S with domain K x [0,1], and range 
equal to the set of d x d matrices, 

GJ = S(C*-i,ej), ( 6 ) 

where '■ t > 1, i > 1 } are i.i.d. on [ 0 , 1 ]“^ with uniform 
distribution. 

A3: The initial conditions {Fg : 1 < i < W} are i.i.d., with 

P{ro(a::) = 1} = fio(a:), xGX 

A4: The measurements are obtained through sampling: 
There is a bounded sequence {jt ■ t > 1} of N- 
dimensional vectors with non-negative entries, and inde¬ 
pendent of the {^t}, such that E[ 7 t(fc)] = N~^ for each t 
and fc, and 

N 

Yt=J2jSP{Xl). 

Moreover, the distribution of 7 is unchanged by permuta¬ 
tions of its N components. 

The first assumption is based on the control architecture 
assumed in all prior work. Assumption A2 is standard in 
this context [13], and can be assumed since any probability 
mass function (pmf) is a function of a uniformly distributed 
random variable. 

Assumption A4 can be used to model random sampling with 
or without replacement. Examples are given in Section IV. 
This assumption leads to the measurement equation (3); ad¬ 
ditional additive noise can be included in the observation 
equation, provided this measurement noise is independent of 
the sampling process. 

The filtration of observations is denoted, 

yt=CT{YrXr-r<t}, t>0. (7) 

Under A1-A3, Gl_^_i is conditionally independent of 
{F*,- • • ,Fj}, given Vt, with 

E[Gj+i I = ( 8 ) 


Two filtering problems are considered in this paper. In the 
first, the state is equal to the empirical distributions expressed 
as column vectors: 

4>t(A:) = {x^), \ <k < d, t>Q. (9) 

In the second, the goal is to estimate the state for an individual 
load. For the Ah load, this is denoted 

=T\{x^), l<k<d,t>Q. (10) 

The two state processes are evidently related, 

1 ^ 

i=l 

Proposition 2.1: The two state processes each evolve as 
linear systems, 

+ Wt+i ( 12 ) 

+ (13) 

where At = and for each i, t, 

1 ^ 

(14) 

k^l 

U7+1 = [Fj(Gj+i-PcJ]^ (15) 

The observation equation (3) can be written, 

Yt = G<^>t + Vt (16) 

where Gj = U{x^) for each j. 

Moreover, each of the sequences VU*, W, V is a martingale 
difference sequence, and the sequence V is also uncorrelated 
with each of the {TV*} and also W. 

Proof: The martingale difference property for each of the 
{TU*} (and hence also W), follows immediately from ( 8 ); see 
also [3]. The sequence V can be expressed, 

N 

Vt = -J2 Af ^ yti^PiXl) 

k i—1 

This has mean zero, from the exchangeability assumption in 
A4, and it is a martingale difference sequence because the 
sequence 7 is i.i.d. It is uncorrelated with {W*} and W under 
the assumption that { 7 ^} is independent of {^J}. ■ 

A derivation of the conditional state covariances is given in 
Section III-A. 

For a linear-Gaussian model, the Kalman filter equations are 
intended to approximate the conditional mean and covariance 
of the state. In the first model (12) they are denoted, 

= E[$t I yt ], Yt = E[$t$^ I yt] (17) 

with For large N it can be argued via the 

CLT that (V, W) is approximately conditionally Gaussian 
given the observations. We might expect the Kalman filter to 
approximate the optimal nonlinear filter in this case. 

In Section III-C the state space is extended to obtain 
estimates of QoS metrics for an individual load that may not 
be a function of the respective state 
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B. Example: Intelligent pools 

Here we focus on a single example in which each load is a 
residential pool pump. Section IV contains extensions to other 
loads. 

In the original model of [6], the state space is taken to be 
the finite set. 



X = {(to, fc) : m e {©, ©}, fce{l,...,©}} (18) 

where X > 1 is an integer. If XI = (©, k), this means that the 
pool pump is on at time t, and has remained on for the past 
k time units. In this paper we take the same state space, but 
with a new interpretation of each state. 



Fig. 1. State transition diagram for pool model. 


The controlled transition matrix is of the form. 


= {I - S)I + 6P^ (19) 

in which is the transition matrix used in [6], and 6 G (0,1). 
At each time t, a weighted coin is flipped with probability of 
heads equal to 6. If the outcome is a tail, then the state does 
not change. Otherwise, a transition is made from the current 
state a; to a new state x'^ with probability P(^^{x,x~^). 

A state transition diagram is shown in Fig. 1. The state 
transition diagram for Pq is identical, except that the self-loops 
are absent. 



Fig. 2. The deviation in power consumption tracks well even with only 20 
pools engaged. 


The motivation comes from conflicting needs of the grid 
and the load; a single load turns on or off only a few times 
per day, yet the grid operator wishes to send a signal far more 
frequently - In this example we assume every 5 minutes. If the 
sampling increments for each load were taken to be 5 minutes, 
then it would be necessary to take X very large in the approach 
of [6]. 

In this paper P(^ is obtained using the optimal-control 
approach of [6]; we take X = 48, and hence d = |X| = 96. It 
is assumed that i5 = 1/6, so that the pool state changes every 
30 minutes on average. In [6] it is shown that the transition 
matrix has desirable properties for control: the linearized 


Fig. 3. Eigenvalues for the observability Grammian for the pool model in two 
cases: The magnitude of eigenvalues decays rapidly for a typical sample-path 
of and for the LTI model obtained with (^ = 0. 


dynamics are minimum phase, with positive DC gain. Hence, 
for example, a persistent positive value of Q will lead to an 
increase in aggregate power consumption. 

For sake of illustration. Fig. 2 shows tracking performance 
of this scheme with only twenty pools. Each pool is assumed 
to consume 1 kW when operating, and each has a 12 hour 
cleaning cycle. The grid operator uses PI compensation to 
determine ^ (see Section IV-B for details). With such a small 
number of loads it is not surprising to see some evidence of 
quantization. For 100 loads or more, and the reference scaled 
proportionately, the tracking is nearly perfect. 

Two QoS metrics have been considered for this model. First 
is ‘chattering’ - a large number of switches from on to off. A 
large value means poor QoS, but this is already addressed 
through design of the controlled transition matrix [ 6 ]. The 
design of Pq also helps to enforce upper and lower bounds 
on the duration of cleaning each day. 

A second metric is total cleaning over a time horizon of one 
week or more. This is the QoS metric considered in [3]. In 
this paper we consider a discounted version; We assume that 
Pq has a unique invariant pmf ttq. With £: X —K a given 
function with zero steady-state mean, J2x'^oix)£{x) = 0 , we 
define for each i and t, 

t 

= ( 20 ) 

k=0 

with f3 G (0,1] a constant. The function £{x) = I(to = ©) — 
I(to = ©) was used in [3] in the case of a 12 hour cleaning 
cycle (recall the notation x = (to, k)). 

Experimental results surveyed in Section IV demonstrate 
that it is possible to estimate functionals of the state process 
such as the QoS metric {£{}, even if the observations are 
subject to significant measurement noise. 

In anticipation of the results to come we ask, what would 
linear systems theory predict with respect to state estimation 
performance? The observability Grammian associated with 
( 12 ) was computed for typical sample paths {At = : 

1 < t < 2016}, where the value 2016 corresponds to one 
week, and C, was scaled to lie between Its rank was 
found to be approximately 40, while the maximal rank is 96 
(the dimension of the state). With Ct = 0 the system is time- 
invariant. In this case the rank of the observability Grammian 
coincides with the rank of the observability matrix, which was 
found to be 23. 

However, these values were obtained using the “rank” com¬ 
mand in Matlab, which relies on finite numerical precision. 
A plot of the magnitude of the eigenvalues for the two 
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observability Grammians shown in Fig. 3 suggests that both 
matrices are full rank. 

Further analysis establishes that the LTI model obtained 
with C = 0 cannot be observable, due to a particular symmetry 
found in this example. A general result given in Prop. 4.1 
implies that A° = 0 for 50 < * < 96. 

III. Kalman Filter Equations 

The second order statistics for the disturbances appearing 
in the linear model (12, 13, 16) are derived here. These 
expressions are used to construct a Kalman filter that generates 
approximations for the conditional mean and covariance (17). 
Other statistics of interest are, 

= Ei = Em^Y\yt] 

, Et_|_i|t = E[(l>t+i(<i)t+i)’’|3^i] 

where again tildes represent deviations, such as — 

Prop. 3.1 states that some statistics of the individual can be 
expressed in terms of those of the population: It is not the 
case that SJ = NT,t, since {$( : 1 < i < TV} are correlated. 

Proposition 3.1: For each s, t, i, and any set S C the 
conditional probability is independent of i: 

e 5 I = P{$1 G 5 I (21) 

and consequently, E[<i)* | yt] = E[<i)s | yt\. Moreover, 
~ the state covariances for the individual are 

EJ= diag($t)-5t5l 

^t+i\t = diag ($t+i|t) - 
and the cross covariances can be expressed, 

I yt\ = Et+I|t, 1 < i < TV. 

Proof: The proof of (21) follows from the symmetry and 
independence conditions imposed in (A1-A4). The remaining 
results follow from this, and the fact that has binary entries 
[in particular, = diag($J)]. ■ 

Recall from the introduction that two formulations of the 
Kalman filter have been considered in this research. For a 
conditionally Gaussian model, the Kalman filter equations 
require the conditional covariances for the state noise, 

E^ = E[Wl+YWl+,y I j;*], Ef = E[Wt+iWl+, I yt] 

( 22 ) 

and also the conditional covariance of the measurement noise, 

Er = E[f}2 I y^_y (23) 

Formulae for the state noise covariances can be obtained in full 
generality. We require the distribution of the random vector 7 t 
introduced in A4 to obtain a formula for E]^. 

The Kalman filter that generates L 2 -optimal estimates over 
all linear functions of the observations uses instead the (un¬ 
conditional) covariance matrices, 

Er = Sr = E[Wt+iWl^y, (24) 


and = E[V'/] (the notation used in standard texts is 
Qt — E^ and Rt = [12]). We show in Prop. 3.2 that 

the two covariance matrices in (22) are linear functions of the 
true conditional mean $(. Expressions for the two covariance 
matrices in (24) follow from Prop. 3.2 and the smoothing 
property of conditional expectation, provided we can compute 
E[<i)t] = E[<i)i]. The formula we obtain for YY in Prop. 3.3 is 
a linear function of the conditional covariance matrices 
and E(_|_i|j. It is unlikely we can obtain formula for the means 

of these covariance matrices, and hence we do not expect to 

—V 

obtain an exact formula for E^ . 


A. State noise covariance 

The following result provides formulae for the conditional 
covariances for the state noise (22) as a function of the 
conditional mean 

Proposition 3.2: Under Assumptions A1-A4, 

= ^(diag(At$t) - Atdiag($t)^t) (25) 

Yf' = diag(At$j)-A*diag($j)A} (26) 


The second covariance is independent of i, with common value 


Yf' = NYY- 


Proof: Since {Wl : 1 < i < TV} is uncorrelated. 


N 




w' 

t 


(27) 


Z =1 


Moreover, Prop. 3.1 gives = 4>t, and from this or (11) it 
is obvious that 



i=l 


Consequently, (25) follows from (26). 

The derivation of the formula (26) for YY is similar to 
the Kalman filter construction in [13]. Given the larger sigma- 
field, 

y+ = y,, Cr,Ar-.r <t, i< n} 


the smoothing property of conditional expectation implies, 

Yr = E[E[wi^ywi^,y\yY]\yt] 

The inner conditional expectation is transformed using the 
definition (15): 

E[wi_,ywi_,,y I yt] 

= \yt] 

- E[$j+i I 3^+]E[<I>j+i I y+Y 

= diag(At$t) 

-At diag($*)A} 

where the final equation uses E[$]_|_j^ | yt] = At^f and the 
fact that 4)} has binary entries for each i and r. 

Taking the conditional expectation given yt gives (26). ■ 
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B. Sampling and observation covariance 

The observation model used in the numerical experiments 
that follow is based on random sampling of loads: An integer 
n < N is held fixed, and at each time instant t a distinct set 
of n indices {ki,... ,kn} is chosen uniformly at random. The 
observation is the average 

n 

Yt = -y^U{X^'). (28) 

Z =1 


Recalling the definition U{X^') = C'T'jS it follows from 
Prop. 3.1 that this is an unbiased estimate of C$t. An 
expression for the conditional variance defined in (23) 
is obtained next. 

Proposition 3.3: The conditional covariance is given by, 


E 


V _ 
t+i — 


1 — n 

n N — 1 


C 




Proof: By definition. 


yV 

^t +1 


= E 


[(^ 


2=1 


yt 


where the permutation is random, uniform, and independent 
of yt- It follows from symmetry of the model that we can 
consider just the first n samples, replacing ki by i. We also 
center the random variables to obtain 


^t+i 


= E 


n 

VnYoi 


t + l\t 


- 


t+llt 


yt 


The quadratic is expanded as the sum of three terms, 

- 252 + ^3 


(29) 


(30) 


in which the second two terms are transformed using the 
following consequence of Prop. 3.1: 


IJ;,] =CE 

The first term is the conditional variance, 

5i :=E[(C$i+i|t)' =C'E*+i|tC^ 

The second term is transformed using (31): 

2=1 

= C-E.+ntC^ 

The third term is another conditional variance, 

2 


So:=E 


1 ^ 


t+l\t 


yt 


53 :=^ 


i=l 


yt 


(31) 

(32) 

(33) 




* j'=i 

Applying symmetry once more gives. 


= nE[{C$,Vi|t}'l 3 ] 

+ - n)E[{C$,V|J{C'$?+i| J I 


Using familiar arguments, and applying (31), 

(iv-i)E[{c$,Vi|J{E^5EiJlE 

N 

= ^e[{c5*ViJ{C'5WI^ 

fc =2 

= iVE[{c5,V|J{C$*+i|J|3;*] 
- E[{C'$i+3|J^ I 3 ;*] 

= WCE,+3|,C'^-CEJ+i|,C^ 
Putting these expressions together gives. 


S 3 = i-Ci nE: 


n — n 


t+l|i ' 




CT 


N 

The desired expression is obtained by substitution of these 
terms in (30). ■ 


C. Estimation of the individual 

A filter is introduced here to estimate the mean and variance 
for an individual state and the mean and variance of the 
QoS metric C\ for a typical load (recall (20)). Estimates of the 
individual state and individual QoS can be used to estimate the 
flexibility of loads, which may vary with time. For example, 
if the grid operator believes that every water heater contains 
water that is too cold, then it is unlikely that there is much 
remaining flexibility to reduce power consumption from these 
loads. 

The Kalman filter must be modified to obtain estimates of 
the QoS for individual loads. One approach is to restrict to 
/? = 1, and use the Markovian model {XI, Cl). This comes 
with high complexity: In the case of the pool model, the first 
component can take on d = 96 values, and the range of values 
of Cl may be over 100. This means that the state space is 
at least 10 "^, which is probably too large to be practical for 
estimation. 

Here we introduce a Kalman filter for the joint process T'J = 
{^l; <i>t; Cl), which is of dimension 2d + 1. The construction 
of a linear model for fP* is based on (12, 13), and the one¬ 
dimensional dynamics for QoS, 

Ej+i = pCl + 

where Ci^l = ({XI). The measurements remain of the form 
(3), which is why it is necessary to include in the definition 
of T'j. The new A, C, state-noise covariance matrices are. 


'At 

0 

o' 


r sr’ 

iv-isr’ 

o' 

0 

Ct 

At 

0 

0 

p 

, [ 0 , C, 0 ], 

”0 

1 

_ 1 

0 

- 1 

0 0 


We thus have all of the system parameters needed to construct 
the Kalman filter. 

IV. Estimating QoS and Closing the Loop 

We have conducted experiments in various settings, using 
both formulations of the Kalman filter, differentiated by the 
use of conditional or unconditional covariance matrices. For 
estimating performance we obtained excellent results with the 
Kalman filter whose gain is based on the conditional covari¬ 
ance matrices. The numerical results focus on the residential 
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pool model, and closes with preliminary extension to TCL 
models. 

The Kalman hlter that is optimal over all linear estimators 
requires unconditional covariances. The state covariance ma¬ 
trix can be expressed as the mean of (25); 


=^w 



diag (At^t) 


At diag ($t)Al 


The mean E[<i)i] = E[<i)i] does not have a closed form 
expression, so we make two approximations. First, we consider 
the mean with (^ = 0, and second we let t ^ oo. The 
covariance used in the hlter is the resulting limit. 


—w 

■^OO 


1 

N 


diag {AtTTo) - At diag (Tro)^!^ 


(34) 


where ttq is invariant for Pq. A similar approximation is used 
for 

The common features in all of the numerical experiments 
that follow are listed here; The reference signal r was gen¬ 
erated from the BPA balancing reserves deployed in the hrst 
month of 2015 [17]. This signal was low pass hltered, and 
then scaled to an appropriate magnitude to match the capacity 
of the aggregate of loads, exactly as in [6]. 

The observation model was based on the sampling assump¬ 
tions introduced in Section III-B, in which a hxed number of 
loads n < N is sampled at each time. Recall from (28) that 
the average of these n samples at time t is denoted Yt. Unless 
stated otherwise, n is taken to be 0.1% of the total population. 
Hence, for a population of = 10, 000 pools, exactly 10 are 
sampled at each time step. 

It is assumed that each pool pump consumes IkW while in 
operation. The actual power consumption at time t is NC^f 
For this reason, plots of or Yt = C^t are scaled by N 
to represent total power consumption or its estimate. 

Finally, the discount factor f3 in the QoS metric (20) was 
chosen as the value for which /3^' « 1/2 when k corresponds 
to one week. The value f3 = 0.9997 is obtained under the 
assumption of five-minute sampling. 


A. Estimation of QoS 

The Kalman hlter described in Section III-C was used to 
obtain estimates of the mean and variance of the QoS metric 
C\, with ^{x) = Cx — y. These experiments were performed 
with ^ equal to the exogenous “mean-held limit” used in the 
linearized model of [3]; 

. Gc 

i + g,g/* 

where is a PI compensator, and Gp is a transfer function 
for a linearized model (see [3] for details). 

Fig. 4 shows estimates of QoS based on a model with 10,000 
pools, with 0.1% sampling rate. The accuracy of estimation is 
remarkable, in spite of signihcant measurement noise. 

Non-ideal settings were also considered. In one experiment 
a population of 20,000 pools was divided into two classes 
of equal size; 10,000 pools operated as previously with a 
12 hr/day cleaning cycle, and the other 10,000 operated with 
a 8hr/day cleaning cycle. It is possible to construct the exact 



t/hours 


Fig. 4. Although the liner model is not observable, estimates of the mean 
and variance of QoS for an individual are nearly perfect. 


Kalman hlter by doubling the dimension of the state space. 
Instead, an approximate model was constructed in which the 
parameters in the two state space models (differentiated by 
cleaning cycle) were averaged; similar approximate models 
are used in [2]. 

Estimation results for the case of two pool classes are shown 
in Fig. 5. The QoS for the two classes of pools have different 
means and variances. This difference cannot be captured using 
this hlter which is designed to estimate the overall mean and 
variance. The hlter generates estimates of the mean QoS that 
lie between the two empirical means, and estimates of its 
variance follow more closely the empirical variance of the 
pools with 12hr/day cleaning cycle. 



t/hours 


Fig. 5. Estimates of the mean and variance of QoS for an individual with a 
heterogenous population. The un-modeled dynamics lead to some error. 


B. Control of the heterogeneous population 

Closed-loop experiments were conducted in which un¬ 
modeled dynamics are present due to both load heterogeneity 
and additional local control. With large un-modeled dynamics 
it is still possible to obtain estimates of the mean of QoS, but 
the Kalman hlter cannot be expected to provide estimates of 
the variance. 

Error feedback was used in all of the remaining experiments, 
Q = GcCt- A PI compensator was used to dehne Gc- A 
proportional gain of 50, and integral gain of 1.5 worked well 
in all examples. These values were based on a nominal LTI 
model — this feedback design resulted in a phase margin of 
about 70 degrees, and gain margin of about lOdB. 

In the experimental results illustrated in Eig. 6, the error 
signal was dehned with respect to the raw measurements. 
















without use of the Kalman hlter. In all other experiments, the 
following dehnition was used; 

et = l^n-Yt, Yt = C^t-y (35) 

with y equal to the average nominal power consumption. The 
scaling of the reference by N~^ is required since (7$* is an 
estimate of average power consumption. 

The total number of pools was taken to be iV = 300, 000. 
Performance results using smaller values of N are discussed 
at the close of Section IV-B, and also in Section IV-C. 

The simulation model used in the remaining experiments 
consisted of nine different classes of pools, distinguished 
by cleaning cycle. The distribution of classes is shown in 
Table I. Suppose that the grid operator had full information 
regarding the number of pools in each class. The full state 
space description would have dimension 9 x d, which is far 
too complex. Moreover, in practice the grid operator will not 
have complete system information. 

Here we make some coarse approximations to simplify the 
model. In addition to reducing complexity of the hlter, our 
goal is to investigate robustness of the estimation algorithms 
in a closed-loop setting. 

Table I: Distribution of pools 
Cycle (hr/day) 20 18 16 14 12 10 8 6 4 


Number (xlO^) 1 1 1 3 5 10 5 3 1 

Although the simulation uses these nine classes of pools, 
for the purposes of hlter design an approximate model was 
constructed based on just two pool models, corresponding 
to 8 and 12-hour cleaning cycles. Following [2], a convex 
combination was chosen for the state matrix. 

At = aAt -f (1 - a)Al'^ 

in which 0 < a < 1 is independent of t, and the superscripts 
® and refer to the respective pool classes. 

The parameter a was chosen to be consistent with the hy¬ 
pothesis that the collection of pools consist of just two classes, 
with 8 or 12 hr/day cycles. It is assumed that the nominal 
average power consumption y is known to the grid operator - 
this can be estimated easily from weekly measurements. The 
2 -class assumption would imply that y = ay^ -f (1 — 
giving a = - y®), with y^^ = x/2 and 

y® = 1/3. 

The covariance matrix used in the Kalman hlter was 
modihed to take into account the large un-modeled dynamics. 
It was chosen as the sum of two terms, 

Sr = yY' + h^Z, (36) 

where kt > 0, YY* is a convex combination of the matrices 
‘ -w 

computed in Prop. 3.2, and E^^ is a convex combination of 
the matrices (34). This structure ensures that E/^ is positive 
semi-dehnite, and that YY 1=0 (consistent with the fact that 
our state evolves in the simplex). The performance of the hlter 
was not very sensitive to the parameter kt, but the best value 


was found to grow linearly with population size. The results 

that follow used kt = iV/300 (note that N cancels the factor 

—tu 

1/A^ appearing in the dehnition (34) of E ). 



Fig. 6. PI control with output-error feedback, with et = N ^rt — Yt. 


With the experimental setting fully described, we now 
survey results from several control experiments. 

We hrst show what happens if we forgo use of the Kalman 
hlter, and take e* = N~^rt — Yt, with Yt = Yt — y, and Yt 
the noisy measurements obtained with 0.1% sampling. Results 
from one experiment are shown in Fig. 6 . The volatility of 
the measurements resulted in similar volatility of input Q- 
This drove the pools to switch frequently, and also resulted 
in degraded grid level tracking. This motivates the use of a 
hlter to smooth the measurements. 

Results using the hltered error (35) are shown in Fig. 7. 
This Kalman hlter estimator based control gives nearly perfect 
tracking performance, in spite of the signihcant modeling 
error. 



t/hour 

Fig. 7. PI control with smoothed error signal (35). 

Tracking experiments were conducted in many other sce¬ 
narios, distinguished by sampling rate and load population, 
using the same distribution of load classes given in Table 
1. Normalized tracking error was chosen as the performance 
metric. 

The full relationship between population size N, sampling 
rate, and closed loop performance is illustrated in Fig. 8 . The 
vertical axis is the root mean square (RMS) of the tracking 
error {e"}. For each of the 4 values of N considered, the 
tracking error is less than 3.5% with 100% measurements. It is 
found that the tracking error is best predicted by the number of 
samples per time-step. Consequently, a smaller population of 
loads requires a larger sampling rate to maintain good tracking 
performance. 
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Closed-loop tracking 

-Output deviation 

.Reference 

0.5 i 

O 

100 110 120 t/hour ^^0 ' 100 110 120 ^/hour ^3° 

Fig. 9. PI control and local opt-out control for the heterogeneous model with 0.1% sampling rate. Tracking performance improves as N is increased. 




Fig. 8. Tracking performance improves with increased sampling rate, or a 
lai'ger population of loads N. 

C. Opt-out local control 

In [3] an additional layer of control is introduced, based on 
the QoS metric C\ introduced in (20): The *th load can opt-out 
of service to the grid (ignore the signal Q) at any time t for 
which Cl lies outside of pre-assigned bounds. The impact of 
these additional un-modeled dynamics is investigated here. 

At each load the QoS metric Cl is constrained to a common 
interval [—50, +50]: if Cl > 50, then the pool is turned off at 
time t, regardless of the value of Q. Similarly, if Cl < —50 
then it will be turned on. 

In the construction of the Kalman hlter, the state noise 
covariance was taken as (36), but in this case the scaling factor 
was chosen as a function of the QoS estimates, kt = k-\-b\Cl\. 
This is chosen to reflect the fact that un-modeled dynamics 
increase with the percentage of pools that opt-out. The values 
k = V/100 and b = 300 worked well, and the sensitivity to 
these values was not high. 

Closed loop performance remained nearly perfect, and the 
percentage of opt out loads was observed to remain under 
0.5% over the entire run. Fig. 9 shows results from two 
experiments with N = 300,000 and N = 30,000, each with 
sampling rate maintained at 0 . 1 %. 

It is seen again that reducing the number of loads results in 
degradation of closed loop performance, since fewer loads are 
sampled at each time-step; the two plots are nearly identical if 
the sampling rate is increased to 1 % for the smaller population. 

D. Observability in Demand Dispatch systems 

It is shown here that the lack of observability observed in 
the pool model with 12 hr cleaning cycle can be attributed to 
symmetry of the model. 

A review of linear systems terminology is required. Con¬ 
sider the general LTI state-space model, 

‘&t+i = + But^ yt = C^t 

where A is a d x d matrix, i? is a d-dimensional column vector, 
and C is a d-dimensional row vector. The observability matrix 


is the d X d matrix whose kth row is equal to and its 

null space Sg is called the unobservable subspace. Suppose 
that {$t} and {‘hj} are two solutions to the state space 
equations with common input sequence {ut}, but different 
initial conditions, ^ ~ C Sg, then for 

each f > 0 , 

j/t - = CA‘($o - $o) = 0 

That is, based on measurements {yt : f > 0}, it is impossible 
to know if the true state sequence is {$*} or {$(}. 

For the time-varying system 

+ But, yt = C^t 

the observability matrix is replaced by the observability Gram- 
mian, Oq- The definition can be found in any introductory 
text on state space control (or, see the encyclopedic text [ 12 ]). 
A critical interpretation is this: If ‘J’o are two initial 
conditions, then with identical inputs 

OO 

($0 - <i>o)"OG($o - $o) = Y.^yt - y'tf 

t=0 

This is why the small eigenvalues observed in Fig. 3 are a 
concern if we wish to reconstruct the entire state based on the 
observations {yt}- 

The proposition that follows concerns a class of LTI models 
that have the following symmetry property: 

A — 

~ A, 

where A is a d x d matrix, and the decomposition is in terms 
of two square matrices Ag, Ag. This requires that the integer 
d is even. 

An example is the pool pump model with 12 hour cleaning 
cycle. Under certain conditions, a TCL model may be sym¬ 
metric following a state transformation - an example is given 
below. Regardless of the details of the model, symmetry rules 
out observability. 

Analysis of the symmetric model is based on the subspaces, 

)2 = {u G : Ui = Vd/2+i, 1 < * < d/2} 

Vo = {v€V:Vv = 0} 

Proposition 4.1: Consider the symmetric LTI model (37). 
If is a transition matrix, then the subspace Vq is contained 
in the unobservable subspace. Consequently, the symmetric 
model is never observable, and the rank of the observability 
matrix is no greater than d/2 + 1. 


C = [T I OT (37) 
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Proof: To prove the proposition we establish that 
A: Vo —i" Vo, so that A^v € Vo for any fc > 1 and any 
V G Vo- By dehnition we also have Cv = 0 for i; G Vo¬ 
lt follows that CA^v = 0 for each k when v G Vo, so that 
Vo C Eg as claimed. 

The subspace V is equal to vectors of the form = (z^ \ 
z"^), with z G and Vo is the subset satisfying ^ Zj = 0. 
Symmetry implies the identity 

Av = {w^ I wT 

where w = {Ag + Ao)z. This shows that A: V ^ V. 

To complete the proof, we next show that ^ Wi = 0 
whenever ^ z^ = 0, so that A: Vo —>■ Vo- 

Since A'^ is a transition matrix, it follows that 

V=VA=[VAg + VAo\VA, + VAg] 

That is, Q = {Ag + AoY is a transition matrix: Ql = 1. 
Consequently, if z’^l = = 0, then 

Vw = {Ag + Ao)z = z’^Ql = 0 

■ 

Consider the following example in which the dimension of 
the symmetric state space model is d = 8, 

i-d 0 0 o' 

021 1 - d 0 0 

0 032 1 — d 0 

0 0 043 1 — 5 

015 Ol6 ®17 5 

0 0 0 0 

0 0 0 0 

0 0 0 0 

Based on the proposition, the rank of the observability matrix 
is no greater than 5. 

The following values are consistent with the pool pump 
model with 12 hour cycle considered previously: 5 = 1/6, 
021 = 0.1654, 032 = 0.0840, 043 = 0.0026, 015 = 0.0013, 
016 = 0.0827, 017 = 0.1641. The 8x8 observability matrix O 
was obtained using Matlab, and the null command results in 
equality Eg = Vo- However, Matlab also returns these values 
for the log of the modulus of the eigenvalues: {log |Ai|} = 

{1.4,0.095, -1.8, -5.8, -19.7, -35.5, -36.5, -37.8} 

Based on these values, it is not clear if the rank of O is equal 
to 4, 5, or 8. 



0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 

Time/hour 


Fig. 10. State transformation for the TCL model: the evolution of Z is 
similar to the pool model. 


Ag — 


and An = 


Consider next a TCL that provides cooling (a refrigerator or 
air-conditioner), with temperature 9t constrained to the dead¬ 
band [0™”,©™“]. A new state process {Zt^rrit) is obtained 
in which mt is the power state as before, and 

^ if mt = 0 

‘ “ [0™ + 0‘"“-6»t ifmt = l 

A typical trajectory for an air-conditioning unit shown in 
Fig. 10 reveals that the sample paths are very similar to the 
pool model. If the Markovian dynamics of Z are symmetric, 
then the LTI model is not observable. 


E. Eigenvectors and reduced order observer 

The structure of eigenvectors and reduced order observers 
are described for both the pool model and a TCL model. 

Observability of the pool model: We return to the homo¬ 
geneous model with 12hr/day cleaning cycle. The observabil¬ 
ity Grammian for the time-varying linear system was found 
to be highly ill-conditioned in all of our experiments - one 
example is illustrated in Fig. 3. The LTI model obtained with 
^ = 0 is not observable, by an application of Prop. 4.1. 



' Pool is on ' Pool is off 

Fig. 11. First four eigenvectors of A for the pool model. 


Fig. 11 shows four eigenvectors of A (left eigenvectors of 
Pq), ordered with respect to the magnitude of the correspond¬ 
ing eigenvalues. To avoid redundancy we considered only 
eigenvalues with non-negative imaginary part. The eigenvector 

has real eigenvalue Ai = 1: it is proportional to the invariant 
pmf ttq, expressed as a column vector. The remaining three 
eigenvalues are complex, as are the corresponding eigenvectors 
{v^, v^, u^}. 

A reduced-order observer was constructed using a state 
transformation based on the LTI model. The construction can 
be interpreted as a truncated representation of the state with 
respect to the eigenvectors of A. In view of the sinusoidal form 
shown in Fig. 11, this approximation is similar to a Fourier 
approximation. 

After several experiments it was found that a 7th order 
approximation ht the dynamics well: the Bode plots nearly 
agreed, and the reduced order observer tracked the true output 
U{Xf) nearly as well as the full order observer. 

Experiments were conducted for the model described in 
Section IV-A, with a 12hr/day cleaning cycle, and a sampling 
rate of 0.1%. Fig. 12 shows a comparison of the state and its 
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Fig. 12. The reduced-order observer yields good estimates of the state $£ 
for typical state values. 


estimate at a particular time in two simulations, distinguished 
by load population. 

The empirical distribution is approximately the same in 
each case, N = 10^ and N = 10®; it is slightly smoother for 
the larger population. As seen in the figure, the state estimates 
are indistinguishable in the two cases. The state is far from 
the steady-state (proportional to shown in Fig. 11), yet the 
estimates are good in each experiment. The explanation for 
success is most likely the similar continuity of the empirical 
distribution and the first seven eigenvectors. 

Observability of a TCL model: The pool example is 
similar to the TCL model considered in [2] and several other 
recent papers. The controlled Markov model for the ith load 
again consists of two parts: a binary component indicating if 
the unit is on or off, and a continuous component indicating 
its temperature. The state of the ith load at time t is denoted 

XI = 

The complete system description follows [2]: For the nom¬ 
inal dynamics considered in prior work, this is defined by a 
dead-hand For cooling devices. 


H+l 


0 , < 0 ™" 

1 , > 0 ”“ 

ml, otherwise 


The temperature is modeled as a linear system driven by white 
noise: 


= a^ei + (1 - a*)( 0 ° - + vl, 

in which 0 < a* < 1. 

The parameters summarized in Table II are taken from [2]. 
In this prior work a heterogeneous model was considered, in 
which the values were sampled from a probabil¬ 

ity distribution. A homogeneous model is considered in the 
experiments surveyed here, in which i?* = C* = 2 for each i. 


Table II: TCL Parameters 


Parameter 

Value 

r: time step (secs) 

2 

J0mm Qmaxj: temperature deadband (°C) 

[20, 21] 

0°'. ambient temperature (°C) 

32 

R: thermal resistance (°C/kW) 

2 

C: thermal capacitance (kWh/°C) 

2 

rj: modeling noise (°C) 

~ iV(0,2.5 X 10"'^) 

energy transfer rate (kW) 

14 

a* = exp{—T/(Ci?)} 



The state space for the TCL model is continuous. A finite 
state-space approximation is obtained by binning the dead¬ 
band interval into 40 bins of equal size. A Markov chain model 
is desired with a total of 80 states, since the state captures 
temperature as well as whether the load is on or off. 

An empirical model was obtained via Monte-Carlo. Sim¬ 
ulation experiments with 10,000 TCLs and 3,600 time steps 
were run to obtain an approximation of the steady-state joint 
distribution, 

= P{Xt = xf Xt+i = x''}, I < j,k < 80. 

The approximate transition matrix is then obtained using 
Bayes’ rule: 

Poixfx'") = ^ ^ , with 7r(x*) = Vn(x^x'"). 

The approximate Markovian dynamics are used in [2] to 
define the heuristic mean-field model, 

{Akj} = {Po{xfx'‘)}. 

The observation matrix used in [2] is the same as in the pool 
example, except for a scaling: 

Ck = 10® X X IjState k is ON}. 

The observability Grammian for this LTI system is highly 
ill-conditioned: Fig. 13 shows that the rapid decay of the 
eigenvalues is similar to what was observed in Fig. 3 for the 
pool model. 



Fig. 13. Magnitude of eigenvalues of observability Grammian for TCL model 

The structure of eigenvectors is also similar. Fig. 14 shows 
four eigenvectors of the matrix A, corresponding to the four 
eigenvalues of maximum magnitude (ignoring duplication 
from complex conjugate pairs). 



Eig. 14. First four eigenvectors of A for a TCL model. 

A finer analysis of the continuous time / continuous state 
model for TCLs (as considered in [11], [18]) may give greater 
insight on the eigenstructure observed for these models. 
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V. Conclusions 

The construction of a Kalman filter for the joint popu¬ 
lation/individual dynamics is possible, and performance is 
remarkable in the test cases considered. In particular, it is 
very surprising to obtain accurate tracking of both the mean 
and variance of QoS for an individual, given extremely noisy 
estimates of the population. 

An open topic for future research is the state estimation 
techniques that take into account opt-out, which is used to 
ensure good QoS to loads [3]. It may be possible to obtain 
a reduced order observer for the very complex model for 
Joint state-QoS dynamics. Alternatively, it may be possible 
to exploit the structure of the distribution of QoS observed in 
numerical experiments. When subject to opt-out control, the 
histogram of QoS is similar to a conditional Gaussian that is 
constrained to the interval specified by opt-out parameters. A 
nonlinear filter may be constructed that takes this structure 
into account. 
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